Abstract

We study the relation between the minimal spanning tree (MST) on many random points
and the "near-minimal" tree which is optimal subject to the constraint that a proportion $\delta$
of its edges must be different from those of the MST. Heuristics suggest that, regardless of
details of the probability model, the ratio of lengths should scale as $1 + \Theta(\delta^2)$. We prove
this scaling result in the model of the lattice with random edge-lengths and in the Euclidean
model.

Keywords: combinatorial optimization; continuum percolation; disordered lattice; local weak 
convergence; minimal spanning tree; Poisson point process; probabilistic analysis of algorithms; 
random geometric graph 

Mathematical subject codes: 05C80; 60K35; 68W40 

1 Introduction 

This paper gives details of one aspect of the following broad project [1]. Freshman calculus
tells us how to find a minimum $x^*$ of a smooth function $f(x)$: set the derivative $f'(x^*) = 0$
and check $f''(x^*) > 0$. The related series expansion tells us, for points $x$ near to $x^*$, how
the distance $\delta = |x - x^*|$ relates to the difference $\varepsilon = f(x) - f(x^*)$ in $f$-values: $\varepsilon$ scales as
$\delta^2$. This scaling exponent 2 persists for functions $f : \mathbb{R}^d \to \mathbb{R}$: if $x^*$ is a local minimum and
$\varepsilon(\delta) := \min\{f(x) - f(x^*) : |x - x^*| = \delta\}$, then $\varepsilon(\delta)$ scales as $\delta^2$ for a generic smooth function $f$.

Combinatorial optimization, exemplified by the traveling salesman problem (TSP), is traditionally
viewed as a quite distinct subject, with theoretical analysis focussed on the number of
steps that algorithms require to find the optimal solution. To make a connection with calculus,
compare an arbitrary tour $x$ through $n$ points with the optimal (minimum-length) tour $x^*$, by
considering the two quantities

$\delta_n(x) = \{\text{number of edges in } x \text{ but not in } x^*\}/n$
$\varepsilon_n(x) = \{\text{length difference between } x \text{ and } x^*\}/s(n)$

where $s(n)$ is the length of the minimum-length tour. Now define $\varepsilon_n(\delta)$ to be the minimum
value of $\varepsilon_n(x)$ over all tours $x$ for which $\delta_n(x) \ge \delta$. Although the function $\varepsilon_n(\delta)$ will depend
on $n$ and the problem instance, we anticipate that for typical instances drawn from a suitable
probability model it will converge in the $n \to \infty$ limit to some deterministic function $\varepsilon(\delta)$. The
universality paradigm from statistical physics [8] suggests there might be a scaling exponent $\alpha$
defined by

$\varepsilon(\delta) \sim \delta^{\alpha}$ as $\delta \to 0$

and that the exponent should be robust under model details.

There is fairly strong evidence [1] that for TSP the scaling exponent is 3. This is based
on analytic methods in a mean-field model of interpoint distances (distances between pairs of
points are random, independent for different pairs, thus ignoring geometric constraints) and
on Monte Carlo simulations for random points in 2, 3 and 4 dimensional space. The analytic
results build upon a recent probabilistic reinterpretation [2] of work of Krauth and Mezard
[9] establishing the average length of mean-field TSP tours. But neither part of these TSP
assertions is rigorous, and indeed rigorous proofs in $d$ dimensions seem far out of reach of
current methodology. In contrast, for the minimum spanning tree (MST) problem, a standard
algorithmically easy problem, a simple heuristic argument (Section 1.2) strongly suggests that
the scaling exponent is 2 for any reasonable probability model. The goal of this paper is to work
through the details of a rigorous proof.

Why study such scaling exponents? For a combinatorial optimization problem, a larger 
exponent means that there are more near-optimal solutions, suggesting that the algorithmic 
problem of finding the optimal solution is intrinsically harder. So scaling exponents may serve 
to separate combinatorial optimization problems of an appropriate type into a small set of classes 
of increasing difficulty. For instance, the minimum matching and minimum Steiner tree problems 
are expected to have scaling exponent 3, and thus be in the same class as TSP in a quantitative 
way, as distinct from their qualitative similarity as NP-complete problems under worst-case 
inputs. In contrast, algorithmically easy problems are expected to have scaling exponent 2, 
analogously to the "calculus" scaling exponent. One plausible explanation is that the near- 
optimal solutions in such problems differ from the optimal solution via only "local changes", 
each local change affecting only a number of edges which remains $O(1)$ as $\delta \to 0$.

1.1 Background 

Steele and Yukich [13] give general background concerning combinatorial optimization over 
random points. 

A network is a graph whose edges $e$ have positive real lengths $\mathrm{len}(e)$. Let $G$ be a finite
connected network. Recall the notion of a spanning tree (ST) $T$ in $G$. Identifying $T$ as a set of
edges, write $\mathrm{len}(T) = \sum_{e \in T} \mathrm{len}(e)$. A minimal spanning tree (MST) is a ST of minimal length;
such a tree always exists but may not be unique. The classical greedy algorithm (Kruskal's
algorithm [7]) for constructing a MST yields two fundamental properties which we record without
proof in Lemma 1.

Let $G_t$ be the subnetwork consisting of those edges $e$ of $G$ with $\mathrm{len}(e) < t$. For arbitrary
vertices $v, w$ define

$\mathrm{perc}(v,w) = \inf\{t : v \text{ and } w \text{ in same component of } G_t\}.$   (1)

For an edge $e = (v,w)$ of $G$ write $\mathrm{perc}(e) = \mathrm{perc}(v,w) \le \mathrm{len}(e)$ and also define the excess

$\mathrm{exc}(e) = \mathrm{len}(e) - \mathrm{perc}(e) \ge 0.$



Lemma 1 Suppose all the edge-lengths in $G$ are distinct.

(a) There is a unique MST, say $T$, and it is specified by the criterion

$e \in T$ if and only if $\mathrm{exc}(e) = 0$.

(b) For any vertices $v, w$

$\mathrm{perc}(v,w) = \max\{\mathrm{len}(e) : e \text{ on path from } v \text{ to } w \text{ in } T\}.$
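As an illustration of definition (1) and Lemma 1(a), the following minimal sketch runs Kruskal's algorithm on a small complete network and records $\mathrm{perc}(v,w)$ as the length at which $v$ and $w$ first join the same component. The function names, the choice of i.i.d. uniform lengths and the network size are assumptions made only for this example.

```python
import itertools
import random


def find(parent, v):
    """Return the root of v in the union-find forest, compressing the path."""
    while parent[v] != v:
        parent[v] = parent[parent[v]]
        v = parent[v]
    return v


def kruskal_with_perc(n, lengths):
    """Run Kruskal's algorithm on vertices 0..n-1.

    lengths maps an edge (v, w) with v < w to its (distinct) length.
    Returns (mst, perc), where perc[(v, w)] follows definition (1): the
    length at which v and w first fall into the same component.
    """
    parent = list(range(n))
    members = {v: [v] for v in range(n)}  # root -> vertices of its component
    mst, perc = set(), {}
    for (v, w), length in sorted(lengths.items(), key=lambda kv: kv[1]):
        rv, rw = find(parent, v), find(parent, w)
        if rv == rw:
            continue  # the edge closes a cycle, so it is not in the MST
        mst.add((v, w))
        # Every pair split between the two components is joined at this length.
        for a in members[rv]:
            for b in members[rw]:
                perc[(min(a, b), max(a, b))] = length
        parent[rv] = rw
        members[rw].extend(members.pop(rv))
    return mst, perc


# Hypothetical example: complete graph on 7 vertices, i.i.d. uniform lengths.
random.seed(0)
n = 7
lengths = {e: random.random() for e in itertools.combinations(range(n), 2)}
mst, perc = kruskal_with_perc(n, lengths)
exc = {e: lengths[e] - perc[e] for e in lengths}
```

In this sketch the edges with zero excess are exactly the MST edges, which is the criterion of Lemma 1(a).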

1.2 The heuristic argument 

Given a probability model for $n$ random points and their interpoint lengths, define a measure
$\mu_n$ on $(0,\infty)$ in terms of the expectation

$\mu_n(0,x) = \frac{1}{n} E\,|\{\text{edges } e : 0 < \mathrm{len}(e) - \mathrm{perc}(e) < x\}|.$

For any reasonable model with suitable scaling of edge-lengths we expect an $n \to \infty$ limit
measure $\mu(\cdot)$, with a density $f_\mu(x) = d\mu/dx$ having a non-zero limit $f_\mu(0^+)$ as $x \downarrow 0$.

Now modify the MST by adding an edge $e$ with $\mathrm{len}(e) - \mathrm{perc}(e) = b$, for some small $b$, to create
a cycle; then delete the longest edge $e' \ne e$ of that cycle, which necessarily has $\mathrm{len}(e') = \mathrm{perc}(e)$.
This gives a spanning tree containing exactly one edge not in the MST and having length greater
by $b$. Repeat this procedure with every edge $e$ for which $0 < \mathrm{len}(e) - \mathrm{perc}(e) < \beta$, for some
small $\beta$. For large $n$, the number of such edges should be $n\mu_n(0,\beta) \approx n f_\mu(0^+)\beta$ to first order in
$\beta$, and assuming there is negligible overlap between cycles, each of the new edges will increase
the tree length by $\approx \beta/2$ on average. So we expect (Lemma 6)

$\delta(\beta) \approx f_\mu(0^+)\beta, \quad \varepsilon(\beta) \approx f_\mu(0^+)\beta^2/2.$

This construction should yield essentially the minimum value of $\varepsilon$ for given $\delta$, so we expect

$\varepsilon(\delta) \approx \frac{\delta^2}{2 f_\mu(0^+)}$   (2)

and in particular we expect the scaling exponent to be 2.
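The single-edge modification used in this heuristic can be sketched as follows: add one non-tree edge, then delete the longest other edge of the cycle it creates. The helper names and the random test network are assumptions introduced for illustration; the sketch only checks one swap and says nothing about overlap between cycles.

```python
import itertools
import random


def mst_edges(n, lengths):
    """Kruskal's algorithm; lengths maps (v, w), v < w, to a distinct length."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    tree = set()
    for (v, w) in sorted(lengths, key=lengths.get):
        rv, rw = find(v), find(w)
        if rv != rw:
            parent[rv] = rw
            tree.add((v, w))
    return tree


def tree_path(tree, start, goal):
    """Return the edges on the unique path from start to goal in the tree."""
    adj = {}
    for v, w in tree:
        adj.setdefault(v, []).append(w)
        adj.setdefault(w, []).append(v)
    stack, prev = [start], {start: None}
    while stack:
        v = stack.pop()
        for w in adj.get(v, []):
            if w not in prev:
                prev[w] = v
                stack.append(w)
    path, v = [], goal
    while prev[v] is not None:
        path.append((min(v, prev[v]), max(v, prev[v])))
        v = prev[v]
    return path


def swap_in(tree, lengths, e):
    """Add the non-tree edge e and delete the longest other edge on its cycle."""
    longest = max(tree_path(tree, *e), key=lengths.get)
    return (tree - {longest}) | {e}, longest


# Hypothetical example: complete graph on 8 vertices, i.i.d. uniform lengths.
random.seed(1)
n = 8
lengths = {f: random.random() for f in itertools.combinations(range(n), 2)}
tree = mst_edges(n, lengths)
e = min((f for f in lengths if f not in tree), key=lengths.get)
new_tree, removed = swap_in(tree, lengths, e)
increase = sum(lengths[f] for f in new_tree) - sum(lengths[f] for f in tree)
```

By Lemma 1(b) the deleted edge has length $\mathrm{perc}(e)$, so `increase` is the excess of the added edge.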

1.3 Results 

Our goal is to formalize the argument above in the context of the following two probability
models for $n$ random points. Fix dimension $d \ge 2$ (the case $d = 1$ is of course rather special).

Model 1 The disordered lattice. Start with the discrete $d$-dimensional cube $C_m^d = [1,2,\ldots,m]^d$,
so there are $n = m^d$ vertices and there are $2d$ edges at each non-boundary vertex. Then take
the edge-lengths to be i.i.d. random variables whose common distribution $\xi$ has finite mean
and some bounded continuous density function $f_\xi(\cdot)$.

Model 2 Random Euclidean. Take the continuum $d$-dimensional cube $[0, n^{1/d}]^d$ of volume $n$.
Put down $n$ independent uniformly distributed random points in this cube. Take the complete
graph on these $n$ vertices, with Euclidean distance as edge-lengths.

The results of this paper will remain valid in a slightly more general framework than Model
2 in which points are put down independently at random in the cube $[0, n^{1/d}]^d$ with common
density $f(n^{-1/d}x)$ on $\mathbb{R}^d$, with $f$ having support on $[0,1]^d$ and being bounded away from zero.
To avoid technicalities, we restrict ourselves to the case $f$ constant.

Each model is set up so that nearest-neighbor distances are order 1 and the MST $T_n$ has
mean length of order $n$. To formalize the ideas in the introduction we define the random variable

$\varepsilon_n(\delta) := \min\left\{ \frac{\mathrm{len}(T'_n) - \mathrm{len}(T_n)}{n} : |T'_n \setminus T_n| \ge \delta n \right\}$   (3)

where the minimum is over spanning trees $T'_n$ and where $T'_n \setminus T_n$ is the set of edges in $T'_n$ but
not in $T_n$.
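For very small networks, definition (3) can be evaluated by exhaustive search over all spanning trees. The sketch below does this on a complete graph with 5 vertices; the function names, the i.i.d. uniform lengths and the values of $\delta$ are assumptions chosen for illustration, and the approach is exponential in $n$.

```python
import itertools
import random


def is_spanning_tree(n, edges):
    """True if the n-1 given edges are acyclic, hence form a spanning tree."""
    parent = list(range(n))

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for v, w in edges:
        rv, rw = find(v), find(w)
        if rv == rw:
            return False
        parent[rv] = rw
    return True


def epsilon_n(n, lengths, delta):
    """Brute-force version of (3): scan all spanning trees of a tiny network."""
    trees = [frozenset(c) for c in itertools.combinations(lengths, n - 1)
             if is_spanning_tree(n, c)]

    def total(t):
        return sum(lengths[e] for e in t)

    mst = min(trees, key=total)
    # Small tolerance guards against rounding in the product delta * n.
    feasible = [t for t in trees if len(t - mst) >= delta * n - 1e-9]
    return min(total(t) - total(mst) for t in feasible) / n


# Hypothetical example: complete graph on 5 vertices, i.i.d. uniform lengths.
random.seed(2)
n = 5
lengths = {e: random.random() for e in itertools.combinations(range(n), 2)}
curve = [epsilon_n(n, lengths, d) for d in (0.0, 0.2, 0.4, 0.6)]
```

The resulting values are non-decreasing in $\delta$, since the feasible sets in (3) are nested.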

Theorem 2 In either model, we have

(a) $\limsup_{\delta \downarrow 0} \delta^{-2} \limsup_n E\varepsilon_n(\delta) < \infty$,

and,

(b) $\liminf_{\delta \downarrow 0} \delta^{-2} \liminf_n E\varepsilon_n(\delta) > 0$.

Structure of the paper In Section 2 we do calculations in the finite models: we prove
Theorem 2 for Model 1 and part (a) of the theorem for Model 2. In Section 3 we introduce the
limit infinite random network (limit in the sense of local weak convergence [4]) and its associated
minimal spanning forest. We show how results from continuum percolation theory allow us to
show part (b) of Theorem 2 for Model 2.



2 Proofs for the finite network 

2.1 The upper bound: Model 1 with d = 2 

We first consider Model 1 with d = 2 and then consider the other cases. 

The upper bound rests upon a simple construction of near-minimal spanning trees, illustrated 
in Figure 1. 

The figure illustrates a particular kind of configuration. There is a 4-cycle of edges $abcd$ where,
for some $x$,

$\mathrm{len}(a) = x, \quad \mathrm{len}(b) \in (x, x+\delta), \quad \mathrm{len}(c) < x, \quad \mathrm{len}(d) < x$

and where the eight other edges touching the cycle have lengths $> x + \delta$. With such a configuration
(within a larger configuration on $C_m^2$), edges $adc$ are in the MST, and edge $b$ is not. We
can modify the minimal spanning tree by removing edge $a$ and adding edge $b$; this creates a new
spanning tree whose extra length equals $\mathrm{len}(b) - x$.

Figure 1: A special configuration on the 3x3 grid.

Thus given a realization of the edge-lengths on the $m \times m$ discrete square, partition the
square into adjacent $3 \times 3$ regions; on each region where the configuration is as in Figure 1,
make the modification above. This changes the MST $T_n$ into a certain near-minimal spanning
tree $T'_n$. On each $3 \times 3$ square, the probability of seeing the Figure 1 configuration equals

$q(\delta) := \int_0^\infty f(x)\,(F(x+\delta) - F(x))\,F^2(x)\,(1 - F(x+\delta))^8\,dx.$

Here $f$ and $F$ are the density and distribution functions of edge-lengths. And the (unconditioned)
increase in edge-length of spanning tree caused by the possible modification equals

$r(\delta) := \int_0^\infty f(x) \left( \int_x^{x+\delta} (y - x) f(y)\,dy \right) F^2(x)\,(1 - F(x+\delta))^8\,dx.$

Letting $n \to \infty$ with fixed $\delta$, and using the weak law of large numbers,

$n^{-1}|T'_n \setminus T_n| \to_p \tfrac{1}{9} q(\delta)$   (4)
$n^{-1}(\mathrm{len}(T'_n) - \mathrm{len}(T_n)) \to_p \tfrac{1}{9} r(\delta).$   (5)

Because we defined $\varepsilon_n(\cdot)$ in terms of spanning trees which differ from the MST by a non-random
proportion of edges, we need a detour to handle expectations over events of asymptotically zero
probability. We defer the proof.

Lemma 3 (a) For any sequence $T^*_n$ of spanning trees, the sequence $n^{-1}\mathrm{len}(T^*_n)$ is uniformly
integrable.

(b) There exist spanning trees $T''_n$ such that

$|T''_n \setminus T_n| \ge a_n$

where $a_n/n \to 1/2$.

Now consider the spanning tree $T^*_n$ defined to be $T'_n$ if $n^{-1}|T'_n \setminus T_n| \ge \tfrac{1}{10} q(\delta)$ and to be $T''_n$ if
not. It follows from (4, 5) and Lemma 3 that

$n^{-1}|T^*_n \setminus T_n| \ge \tfrac{1}{10} q(\delta)$ (for large $n$)

$\limsup_n n^{-1} E(\mathrm{len}(T^*_n) - \mathrm{len}(T_n)) \le \tfrac{1}{9} r(\delta).$

Then from the definitions of $q(\delta), r(\delta)$ and the assumption that $f(\cdot)$ is bounded it is easy to
check

$q(\delta) \sim c\delta, \quad r(\delta) \sim \tfrac{1}{2}\delta q(\delta)$ as $\delta \downarrow 0$   (6)

for a certain $0 < c < \infty$. This establishes the upper bound (a) in Theorem 2.

Figure 2: A special configuration on the 3x3 square.
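The relations (6) can be checked numerically for a specific edge-length distribution. The sketch below approximates $q(\delta)$ and $r(\delta)$ by the midpoint rule, taking the exponential distribution with mean 1 as an assumed example (it has finite mean and a bounded continuous density); the function name, truncation point and step counts are likewise assumptions.

```python
import math


def q_and_r(delta, density, cdf, x_max=30.0, outer=3000, inner=20):
    """Midpoint-rule approximations of q(delta) and r(delta)."""
    h = x_max / outer
    k = delta / inner
    q = r = 0.0
    for i in range(outer):
        x = (i + 0.5) * h
        # Edge a has length x, edges c and d are shorter, and the eight
        # surrounding edges are longer than x + delta.
        common = density(x) * cdf(x) ** 2 * (1.0 - cdf(x + delta)) ** 8
        # Inner integral of (y - x) f(y) over x < y < x + delta.
        gain = sum((j + 0.5) * k * density(x + (j + 0.5) * k)
                   for j in range(inner)) * k
        q += common * (cdf(x + delta) - cdf(x)) * h
        r += common * gain * h
    return q, r


def exp_density(x):
    return math.exp(-x)


def exp_cdf(x):
    return 1.0 - math.exp(-x)


# Ratio r(delta) / (delta * q(delta)) for two small values of delta.
ratios = {}
for delta in (0.1, 0.01):
    q, r = q_and_r(delta, exp_density, exp_cdf)
    ratios[delta] = r / (delta * q)
```

Under these assumptions the ratio $r(\delta)/(\delta q(\delta))$ moves towards $1/2$ as $\delta$ decreases, in line with (6).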

Proof of Lemma 3. Part (a) is automatic because, writing $\sum_e$ for the sum over all edges
of $C_m^2$, the sequence $n^{-1}\sum_e \xi_e$ is uniformly integrable. For (b), note that the cube $C_m^2$ with
$2m(m-1)$ edges can be regarded as a subgraph of the discrete torus $\mathbb{Z}_m^2$ with $2m^2$ edges. Take a
uniform random spanning tree $\tilde{\mathbf{T}}_n$ on $\mathbb{Z}_m^2$, delete edges not in $C_m^2$ and add back boundary edges
to make some (non-uniform) random spanning tree $\mathbf{T}_n$ on $C_m^2$. By symmetry of the torus we have
$P(e \in \tilde{\mathbf{T}}_n) = (m^2-1)/(2m^2)$ for each edge $e$ of the torus, and it follows that
$P(e \in \mathbf{T}_n) = (m^2-1)/(2m^2)$ for each
non-boundary edge of the cube. Since there are $4(m-1)$ boundary edges and $2(m-1)(m-2)$
non-boundary edges, for any spanning tree $t$ we have

$E|\mathbf{T}_n \cap t| \le 4(m-1) + (n-1)(m^2-1)/(2m^2) = 4(n^{1/2} - 1) + (n-1)^2/(2n).$

So

$E|\mathbf{T}_n \setminus t| = (n-1) - E|\mathbf{T}_n \cap t| \ge a_n := (n-1) - 4(n^{1/2} - 1) - (n-1)^2/(2n).$

So for any spanning tree $t$ there exists some spanning tree $t^*$ such that $|t^* \setminus t| \ge a_n$. Applying
this fact to the MST gives (b).
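The claim $a_n/n \to 1/2$ in Lemma 3(b) follows by expanding the last term in the definition of $a_n$; the short computation below is included only as a check of that arithmetic.

```latex
\frac{(n-1)^2}{2n} = \frac{n}{2} - 1 + \frac{1}{2n},
\qquad\text{so}\qquad
a_n = (n-1) - 4(n^{1/2}-1) - \frac{(n-1)^2}{2n}
    = \frac{n}{2} - 4n^{1/2} + 4 - \frac{1}{2n},
\qquad
\frac{a_n}{n} = \frac{1}{2} - 4n^{-1/2} + \frac{4}{n} - \frac{1}{2n^2}
\;\longrightarrow\; \frac{1}{2}.
```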

2.2 Upper bound: other cases 

The argument for Model 1 in the case $d \ge 3$ involves only very minor modifications of the proof
above, so we turn to Model 2 with $d = 2$ (the case $d \ge 3$ is similar). Here it is natural to consider
a different notion of special configuration.

Here there is a $3 \times 3$ square containing a concentric $1 \times 1$ square. There are three points within
the larger square, all being inside the smaller square. In the triangle $abc$ formed by the three
points, writing $x$ for the length of the second longest edge, the length of the longest edge
is in the interval $(x, x+\delta)$, and $x + \delta < 1$. For such a configuration (within a configuration on
an $m \times m$ square containing the $3 \times 3$ square), edges $ac$ are in the MST, and edge $b$ is not. We
can modify the minimal spanning tree by removing edge $a$ and adding edge $b$; this creates a new
spanning tree whose extra length equals $\mathrm{len}(b) - x$.

We now repeat the argument from the previous section, and the overall logic is the same.
One gets different formulas for $q(\delta), r(\delta)$ but they have the same relationship (6). The weak law
(4, 5) is easily established. The only non-trivial difference is that we need to replace the technical
Lemma 3 by the following technical lemma.

Lemma 4 (a) There exists $c_1$ such that for any $n$ and any configuration on $n$ points in the
square of area $n$, the MST $T_n$ has $\mathrm{len}(T_n) \le c_1 n$.

(b) For sufficiently large $n$, there exist spanning trees $T''_n$ such that $\mathrm{len}(T''_n) \le 12 c_1 n$ and

$n^{-1}|T''_n \setminus T_n| \ge \tfrac{1}{2}.$

Proof. Part (a) follows from the analogous result for TSP; see [11] inequality (2.14). For (b),
let $\xi_1, \ldots, \xi_n$ be the positions of the $n$ random points and recall that $T_n$ is their MST. Classify
these points as "odd" or "even" according to whether the number of edges in the path inside
$T_n$ from $\xi_1$ to $\xi_i$ is odd or even. Let $(\tilde\xi_i)$ be a configuration obtained from $(\xi_i)$ by moving each
"odd" point a distance $11c_1$ in some arbitrary direction. Let $\tilde T_n$ be the MST on $(\tilde\xi_i)$. Let $T''_n$ be
the spanning tree on $(\xi_i)$ defined by

$(\xi_i, \xi_j) \in T''_n$ iff $(\tilde\xi_i, \tilde\xi_j) \in \tilde T_n$.

Suppose $(\xi_i, \xi_j)$ is an edge of both $T_n$ and $T''_n$. Since one end-vertex is odd and the other is even,
it is easy to see:

either (i) $\mathrm{len}(\xi_i, \xi_j) \ge 5c_1$; or (ii) $\mathrm{len}(\tilde\xi_i, \tilde\xi_j) \ge 5c_1$.

But by part (a) there are at most $n/5$ edges satisfying (i), and similarly for (ii). So
$|T_n \cap T''_n| \le 2n/5$. Noting that

$\mathrm{len}(T''_n) \le 11c_1(n-1) + \mathrm{len}(\tilde T_n) \le 12 c_1 n$

using (a), we have established (b).

2.3 The lower bound: a discrete lemma 

The lower bound argument rests upon the following simple lemma. 

Lemma 5 Consider a finite connected network with distinct edge-lengths. If $T$ is the MST and
$T'$ is any ST then

$\mathrm{len}(T') - \mathrm{len}(T) \ge \sum_{e' \in T' \setminus T} \mathrm{exc}(e').$

Proof. Suppose $|T' \setminus T| = k \ge 1$. It is enough to show that there exist $e' \in T' \setminus T$ and $e \in T \setminus T'$
such that

(i) $T^* = T' \setminus \{e'\} \cup \{e\}$ is a ST;

(ii) $|T^* \setminus T| = k - 1$;

(iii) $\mathrm{len}(e') - \mathrm{len}(e) \ge \mathrm{exc}(e')$

for then we can continue inductively.

To prove this we first choose an arbitrary $e' \in T' \setminus T$. Consider $T' \setminus \{e'\}$. This is a two-component
forest; so the path in $T$ linking the end-vertices of $e'$ must contain some edge $e \in T \setminus T'$
which links these two components. So choose some such edge $e$. Properties (i) and (ii) are clear.
Apply Lemma 1(b) to the end-vertices of $e'$ to see that $\mathrm{perc}(e') \ge \mathrm{len}(e)$. So

$\mathrm{len}(e') - \mathrm{len}(e) \ge \mathrm{len}(e') - \mathrm{perc}(e') = \mathrm{exc}(e')$

which is (iii).
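Lemma 5 can be checked by exhaustive enumeration on a small network. The following sketch computes, over all spanning trees $T'$ of a complete graph on 6 vertices, the smallest value of $\mathrm{len}(T') - \mathrm{len}(T) - \sum_{e' \in T' \setminus T} \mathrm{exc}(e')$; the helper names and the random lengths are assumptions made for this illustration.

```python
import itertools
import random


def kruskal(n, lengths):
    """Return (mst, perc): the MST and perc(v, w) for every vertex pair."""
    comp = {v: [v] for v in range(n)}   # root -> vertices in that component
    root = list(range(n))               # vertex -> root of its component
    mst, perc = set(), {}
    for (v, w) in sorted(lengths, key=lengths.get):
        rv, rw = root[v], root[w]
        if rv == rw:
            continue
        mst.add((v, w))
        for a in comp[rv]:
            for b in comp[rw]:
                perc[(min(a, b), max(a, b))] = lengths[(v, w)]
        for a in comp[rv]:
            root[a] = rw
        comp[rw] += comp.pop(rv)
    return mst, perc


def is_tree(n, edges):
    """True if the given n-1 edges contain no cycle."""
    root = list(range(n))
    for v, w in edges:
        if root[v] == root[w]:
            return False
        old, new = root[v], root[w]
        root = [new if r == old else r for r in root]
    return True


def lemma5_slack(n, lengths):
    """Smallest value of len(T') - len(T) - sum of exc(e') over T' minus T."""
    mst, perc = kruskal(n, lengths)
    base = sum(lengths[e] for e in mst)
    slacks = []
    for cand in itertools.combinations(lengths, n - 1):
        if not is_tree(n, cand):
            continue
        extra = sum(lengths[e] - perc[e] for e in cand if e not in mst)
        slacks.append(sum(lengths[e] for e in cand) - base - extra)
    return min(slacks)


# Hypothetical example: complete graph on 6 vertices, i.i.d. uniform lengths.
random.seed(3)
n = 6
lengths = {e: random.random() for e in itertools.combinations(range(n), 2)}
slack = lemma5_slack(n, lengths)
```

The minimum is attained by $T' = T$, so `slack` is zero up to floating-point rounding.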

We will also need the following integration lemma; part (a) will be used for Model 1 and 
part (b) for Model 2. 

Lemma 6 (a) Let $\xi$ and $W$ be independent real-valued random variables such that $\xi$ has a
density function bounded by a constant $b$. Then for any event $A \subset \{\xi > W\}$ we have

$E(\xi - W)1_A \ge \frac{P^2(A)}{2b}.$

(b) Let $(V_i, i \ge 1)$ be real-valued r.v.'s such that $\mu(0,x) := E\sum_i 1_{(0 < V_i < x)}$ satisfies
$\limsup_{x \downarrow 0} \mu(0,x)/x < \infty$. Then there exists a function $g(s) \sim \beta s^2$ as $s \downarrow 0$, for some $\beta > 0$,
such that for any sequence of events $A_i \subset \{V_i > 0\}$,

$E\sum_i V_i 1_{A_i} \ge g\left(\sum_i P(A_i)\right).$

Proof. (a) It is sufficient to prove

$E[(\xi - W)1_A \mid W] \ge \frac{P^2(A \mid W)}{2b}$ a.s.,   (7)

then by Jensen's inequality, we get

$E[E[(\xi - W)1_A \mid W]] \ge \frac{E[P^2(A \mid W)]}{2b} \ge \frac{P^2(A)}{2b}.$

Since $\xi$ and $W$ are independent of each other, equation (7) reduces to

$E[\xi 1_A] \ge \frac{P^2(A)}{2b}$ for $A \subset \{\xi > 0\}$.

We can couple $\xi$ to a r.v. $U$ such that

(i) $U = 0$ on $\{\xi \le 0\}$.

(ii) $U$ has constant density $b$ on $(0, P(\xi > 0)/b)$.

(iii) $U \le \xi$.

Now it suffices to prove

$E[U 1_A] \ge \frac{P^2(A)}{2b}$ for $A \subset \{U > 0\}$.   (8)

But it is clear that, for a given value of $P(A)$, the choice of $A \subset \{U > 0\}$ that minimizes $E[U 1_A]$
is of the form $A_c := \{0 < U < c\}$ for some $c \le P(\xi > 0)/b$. A brief calculation gives

$E[U 1_{A_c}] = \int_0^c b u\,du = \frac{bc^2}{2} = \frac{P^2(A_c)}{2b},$

establishing (8).

(b) For small $s > 0$ define $g(s)$ by

$g(s) = \int_0^{c(s)} x\,\mu(dx)$ where $\mu(0, c(s)) = s$.

By hypothesis there exists $\gamma > 0$ such that $c(s) \ge \gamma s$ for small $s$. So

$g(s) \ge \int_{c(s/2)}^{c(s)} x\,\mu(dx) \ge c(s/2) \times s/2 \ge \gamma s^2/4.$

Taking $A_i^s := \{0 < V_i < c(s)\}$ we have

$\sum_i P(A_i^s) = \mu(0, c(s)) = s; \quad E\sum_i V_i 1_{A_i^s} = \int_0^{c(s)} x\,\mu(dx) = g(s).$

This is clearly the choice of $(A_i)$ which minimizes the left side subject to $\sum_i P(A_i) = s$, and so
for arbitrary $(A_i)$ we have

$E\sum_i V_i 1_{A_i} \ge g\left(\sum_i P(A_i)\right).$



2.4 The lower bound in Model 1 



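We treat the case $d = 2$, but $d \ge 3$ involves only minor changes. Recall $C_m^2 = G^{(n)}$ has
$c_n := 2(n - n^{1/2})$ edges. Fix $\delta > 0$. Consider a pair $(T'_n, T_n)$ attaining the minimum in the
definition (3) of $\varepsilon_n(\delta)$. For a uniform random edge $e_n$ of $C_m^2$,

$P(e_n \in T'_n \setminus T_n) = \frac{E|T'_n \setminus T_n|}{c_n} \ge \frac{\delta n}{c_n}$   (9)

and

$E\varepsilon_n(\delta) = \frac{1}{n} E(\mathrm{len}(T'_n) - \mathrm{len}(T_n))$
$\ge \frac{1}{n} E \sum_{e \in T'_n \setminus T_n} \mathrm{exc}(e)$ by Lemma 5
$= \frac{c_n}{n} E\,\mathrm{exc}(e_n) 1_{(e_n \in T'_n \setminus T_n)}.$   (10)

For a fixed edge $e$ of $C_m^2$ we can write

$\mathrm{exc}(e) = (\xi^{(n)}(e) - W^{(n)}(e))^+$

where $\xi^{(n)}(e)$ is the edge-length of $e = (v, v^*)$ and where

$W^{(n)}(e) = \inf\{t : v \text{ and } v^* \text{ in the same component of } G^{(n)}_t \setminus \{e\}\}.$

Note (and this is the key special feature that makes Model 1 easy to study) that $\xi^{(n)}(e)$ and
$W^{(n)}(e)$ are independent. Since $\mathrm{exc}(e) > 0$ on $\{e \in T'_n \setminus T_n\}$ we see that the quantity at (10) is
of the form appearing in Lemma 6(a). So

$E\varepsilon_n(\delta) \ge \frac{c_n}{n} E(\xi^{(n)}(e_n) - W^{(n)}(e_n)) 1_{(e_n \in T'_n \setminus T_n)}$ by (10)
$\ge \frac{c_n}{n} \frac{P^2(e_n \in T'_n \setminus T_n)}{2b}$ by Lemma 6(a)
$\ge \frac{\delta^2 n}{2 b c_n}$ by (9)

where $b$ is the bound on the density of $\xi$. Because $c_n \sim 2n$ we have established part (b) of
Theorem 2 in this case.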
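For reference, the constant implied by this chain of inequalities can be written out explicitly, using only (9), Lemma 6(a) and $c_n = 2(n - n^{1/2})$; the symbol $b$ denotes the density bound of Lemma 6(a).

```latex
\frac{c_n}{n}\cdot\frac{1}{2b}\left(\frac{\delta n}{c_n}\right)^{2}
  = \frac{\delta^{2} n}{2 b\, c_n}
  = \frac{\delta^{2}}{4b\,(1 - n^{-1/2})}
  \;\longrightarrow\; \frac{\delta^{2}}{4b}
  \quad (n \to \infty),
\qquad\text{hence}\qquad
\liminf_{n} E\varepsilon_n(\delta) \ge \frac{\delta^{2}}{4b}.
```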



3 The minimum spanning forest and continuum percolation 

It remains to prove the lower bound in Model 2. Rather than doing calculations with the finite
model, we consider the limit Poisson process on the plane, and exploit the well known connection
between the minimum spanning forest (MSF) and continuum percolation. We then relate the
finite models to the infinite limits in Section 3.3 as an instance of local weak convergence [4] of
random graphical structures.



3.1 Minimum spanning forests 

Here is a general definition, in the context of a countable-vertex network $G$ with distinct edge-lengths
(see [5] for more detailed treatment). As in Section 1.1 let $G_t$ be the subnetwork
consisting of those edges $e$ of $G$ with $\mathrm{len}(e) < t$. Define the MSF by:

an edge $(v, w)$ is in the MSF if and only if, for $t = \mathrm{len}(v, w)$, vertices $v$ and $w$ are in
different components of $G_t$ and at least one of these components is finite.

Consider a Poisson point process $\Phi = \sum_i \delta_{\eta_i}$ of rate 1 in $\mathbb{R}^d$. Add an extra point $O$ at the
origin. Consider $\Phi^0 = \sum_i \delta_{\eta_i} + \delta_O$ as the vertices of a network $G$ (the complete graph with
Euclidean edge-lengths). With probability one, $\Phi^0$ has only finitely many points in any bounded
subset of $\mathbb{R}^d$ and all of the interpoint distances are distinct. As in Section 1.1 we define for
arbitrary points $\eta_i$ and $\eta_j$ of $\Phi^0$,

$\mathrm{perc}(\eta_i, \eta_j) = \inf\{t : \eta_i \text{ and } \eta_j \text{ are in the same component of } G_t\}.$

We now give some properties of the MSF denoted $\mathcal{F}_\infty$ on this network and show how Lemma
1 extends to this setting.

Lemma 7 (a) We have $e \in \mathcal{F}_\infty$ if and only if $\mathrm{len}(e) = \mathrm{perc}(e)$.

(b) For any vertex-pair $u, v$ write $u \to v$ for the set of paths $\pi$ from $u$ to $v$. Then, a.s.

$\mathrm{perc}(u,v) = \min_{\pi : u \to v} \max\{\mathrm{len}(e) : e \in \pi\}.$   (11)

Proof. Let us say that $G$ has the uniqueness property if for every vertex-pair $u, v \in G$, the
graph $G_{\mathrm{len}(u,v)}$ has at most one infinite component (note that this notion was used in the proof
of Lemma 2.1 in [12]). Part (a) will follow from the fact that $\Phi^0$ has the uniqueness property,
which implies:

$e = (v, w) \in \mathcal{F}_\infty \Leftrightarrow v$ and $w$ are in different components of $G_{\mathrm{len}(e)}$
$\Leftrightarrow \mathrm{perc}(e) \ge \mathrm{len}(e)$
$\Leftrightarrow \mathrm{perc}(e) = \mathrm{len}(e).$

To show that $\Phi^0$ has the uniqueness property almost surely it is enough to show

$P(\forall u \in \Phi^0,\ G_{\mathrm{len}(O,u)} \text{ has at most one infinite component}) = 1.$   (12)

This last fact follows from Theorem 1.8 (and Remark 1.10) of [6], which implies (see also [5]),

$P(G_t \text{ includes at most one infinite component for each } t \in \mathbb{R}) = 1.$   (13)

Note that (12) can be proved without appealing to the simultaneous uniqueness result as follows:

$P(\forall u \in \Phi^0,\ G_{\mathrm{len}(O,u)} \text{ has at most one infinite component})$
$= \lim_n P(\forall u \in \Phi^0 \cap B(n),\ G_{\mathrm{len}(O,u)} \text{ has at most one infinite component})$
$\ge \lim_n P(\forall u \in \Phi^0 \cap B(n),\ G_{\mathrm{len}(O,u)} \setminus B(n) \text{ has at most one infinite component}),$

where $B(n)$ is the ball of center the origin and radius $n$ and for any network $G$ on $\mathbb{R}^d$, $G \setminus B(n)$
is the subnetwork with edges and vertices in $\mathbb{R}^d \setminus B(n)$. By independence and the fact that there
can be at most one infinite component in continuum percolation (see Theorem 3.6 in [10]), we
have

$P(\forall u \in \Phi^0 \cap B(n),\ G_{\mathrm{len}(O,u)} \setminus B(n) \text{ has at most one infinite component} \mid \Phi^0 \cap B(n)) = 1$ a.s.

which proves (12).
We now prove (b). Let $t = \mathrm{perc}(u,v)$; the definition of $\mathrm{perc}(u,v)$ may be restated easily as:

$t = \mathrm{perc}(u,v) = \inf_{\pi : u \to v} \max\{\mathrm{len}(e) : e \in \pi\}.$

Hence (b) amounts to proving that with probability one, this infimum is indeed a minimum.
Note that $G_t(u) \cap G_t(v) = \emptyset$ and by (13) a.s. at least one of these two clusters, say $G_t(u)$, is
finite. Let $E$ be the set of edges with exactly one of its end vertices in $G_t(u)$ and the other
one in $G_t(u)^c$, and with edge length less than $t + 1$. The set $E$ is a.s. finite and then we
easily see that $\min\{\mathrm{len}(e), e \in E\} = t = \mathrm{perc}(u,v)$ since $u$ and $v$ are in the same component
of $G_{t+\epsilon}$ for any $\epsilon > 0$. Let $e^* = \mathrm{argmin}\{\mathrm{len}(e), e \in E\}$ and write $e^* = (a, b)$ with $a \in G_t(u)$
and $b \in G_t(u)^c$. Since a.s. we have $\mathrm{len}(e) \ne \mathrm{len}(e^*) = t$ for any $e \ne e^*$, a.s. we have
$G_{t+}(u) := \bigcap_{\epsilon > 0} G_{t+\epsilon}(u) = G_t(u) \cup \{e^*\} \cup G_t(b)$ and $G_{t+}(v) = G_t(v)$. But the definition of
$t = \mathrm{perc}(u,v)$ implies that $G_{t+}(u) = G_{t+}(v)$, and hence $b \in G_t(v)$. It follows that

$\inf_{\pi : u \to v} \max\{\mathrm{len}(e) : e \in \pi\} = \mathrm{len}(e^*) = \min_{\pi : u \to v} \max\{\mathrm{len}(e) : e \in \pi\},$

and (b) follows.
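On a finite network the minimax characterization (11) can be computed directly. The sketch below uses a Dijkstra-style search in which the cost of a path is its longest edge; the function name and the small example network are assumptions for illustration, and the sketch does not address the infinite-network issues treated in the proof above.

```python
import heapq


def minimax_distance(adj, source, target):
    """Smallest possible value of the longest edge on a path source -> target.

    adj maps each vertex to a list of (neighbour, length) pairs.
    """
    best = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        bottleneck, v = heapq.heappop(heap)
        if v == target:
            return bottleneck
        if bottleneck > best.get(v, float("inf")):
            continue  # stale heap entry
        for w, length in adj[v]:
            candidate = max(bottleneck, length)
            if candidate < best.get(w, float("inf")):
                best[w] = candidate
                heapq.heappush(heap, (candidate, w))
    return float("inf")


# Hypothetical 4-vertex network with distinct edge-lengths.
edges = {(0, 1): 4.0, (1, 3): 1.0, (0, 2): 2.0, (2, 3): 3.0, (0, 3): 5.0}
adj = {v: [] for v in range(4)}
for (v, w), length in edges.items():
    adj[v].append((w, length))
    adj[w].append((v, length))

perc_03 = minimax_distance(adj, 0, 3)
perc_01 = minimax_distance(adj, 0, 1)
```

In this example the edge $(0,1)$ of length 4 has $\mathrm{perc} = 3$, so it has positive excess and is not in the MST, consistent with Lemma 1(a).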

3.2 Finite density 

Define the measure $\mu$ on $(0, +\infty)$ by

$\mu(0,x) = E\sum_i 1(0 < \mathrm{len}(0,\eta_i) - \mathrm{perc}(0,\eta_i) < x).$

The next lemma formalizes the heuristic idea $f_\mu(0^+) < \infty$ from Section 1.2.

Proposition 8 In Model 2, we have,

$\limsup_{x \downarrow 0} \frac{\mu(0,x)}{x} < \infty.$

For $(x_1, \ldots, x_n) \in (\mathbb{R}^d)^n$ we define $\Phi^{x_1,\ldots,x_n} = \Phi + \sum_{i=1}^n \delta_{x_i}$, and write $P^{x_1,\ldots,x_n}$ for the
probability measure associated with the random variable $\Phi^{x_1,\ldots,x_n}$. Using Campbell's formula,
we have

$\mu(0,x) = \omega_d \int_0^\infty P^{0,\mathbf{t}}(\mathrm{perc}(0,\mathbf{t}) \in [t - x, t))\, t^{d-1}\,dt,$

where $\mathbf{t}$ is the point $(t, 0, \ldots, 0)$ and $\omega_d = 2\pi^{d/2}/\Gamma(d/2)$ is the surface of the unit sphere.
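The constant $\omega_d$ in the polar-coordinate form of Campbell's formula is easy to evaluate. The sketch below computes it with the standard gamma function and, as a sanity check, recovers the volume of a ball as $\omega_d R^d / d$, which is the expected number of points of a rate-1 Poisson process within distance $R$ of the origin. The function names are assumptions for this example.

```python
import math


def sphere_surface(d):
    """Surface area of the unit sphere in R^d: 2 * pi^(d/2) / Gamma(d/2)."""
    return 2.0 * math.pi ** (d / 2.0) / math.gamma(d / 2.0)


def ball_volume(d, radius):
    """Volume of a ball in R^d, i.e. omega_d times the integral of t^(d-1)."""
    return sphere_surface(d) * radius ** d / d


omega_2 = sphere_surface(2)   # circumference of the unit circle
omega_3 = sphere_surface(3)   # area of the unit sphere in three dimensions
```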

We need to introduce some continuum percolation terminology. For any $r$ and $t$, we define
the probability measure $P_r^{0,\mathbf{t}}$ under which $\Phi$ is a Poisson point process of intensity 1 and an edge
$e$ from the complete graph is said to be open (resp. closed) if $\mathrm{len}(e) < r$ (resp. $\mathrm{len}(e) \ge r$).
We denote by $C^0$ the open cluster containing the origin: $C^0 = G_r(O)$. Let $r_c$ be the critical
radius for the Poisson continuum percolation model of density 1 and deterministic radius, i.e.
for $r < r_c$ the number of vertices in any open cluster is finite whereas for $r > r_c$ there exists a
unique unbounded open cluster.
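For a finite point configuration, the open cluster of a given point at radius $r$ can be found by breadth-first search over edges shorter than $r$. The sketch below is a minimal illustration with a hand-picked configuration; the function name and the points are assumptions, and a finite sample of course cannot exhibit an unbounded cluster.

```python
import math
from collections import deque


def open_cluster(points, r, start=0):
    """Indices of points joined to points[start] by edges shorter than r."""
    seen = {start}
    queue = deque([start])
    while queue:
        i = queue.popleft()
        for j, p in enumerate(points):
            if j not in seen and math.dist(points[i], p) < r:
                seen.add(j)
                queue.append(j)
    return seen


# Hypothetical configuration on a line in the plane; index 0 is the origin.
points = [(0.0, 0.0), (0.9, 0.0), (1.8, 0.0), (4.0, 0.0)]
small = open_cluster(points, 0.5)
medium = open_cluster(points, 1.0)
large = open_cluster(points, 2.3)
```

As $r$ increases the cluster of the origin grows monotonically, which is the finite analogue of the coupling across radii used in the percolation arguments.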

Write C, Ci, C2 for positive constants not depending on the parameters of the problem. 
Lemma 9 For any $\epsilon > 0$, we have

$P^{0,\mathbf{t}}(\mathrm{perc}(0,\mathbf{t}) \in [t - x, t)) \le C_1 x,$ for $0 < t < r_c - \epsilon$,
$P^{0,\mathbf{t}}(\mathrm{perc}(0,\mathbf{t}) \in [t - x, t)) \le C_1 x e^{-C_2 t},$ for $t > r_c + \epsilon$.



We first introduce some notations. The edge-length is the Euclidean distance denoted $\mathrm{len}(u,v) = |u - v|$.
For a set $S \subset \mathbb{R}^d$, we denote by $d(S) = \sup\{|x - y|, x, y \in S\}$ its diameter. For $x \in \mathbb{R}^d$ and
$r > 0$, $B(x, r)$ denotes the open ball of radius $r$ centered at $x$. For $t > 0$ we denote $S(t) = [-t, t]^d$.
Under the probability measure $P_r^{0,\mathbf{t}}$, the occupied region is $\bigcup_{X \in \Phi^{0,\mathbf{t}}} B(X, r/2)$ and the vacant
region is the complement of the occupied region. The occupied component of the origin $W$ is
defined by $P_r^{0,\mathbf{t}}(W = \bigcup_{X \in C^0} B(X, r/2)) = 1$. The vacant component containing the point $\mathbf{t}/2$ is
denoted by $V$. More generally, for $r > 0$ the occupied region at level $r$ is $\bigcup_{X \in \Phi^{0,\mathbf{t}}} B(X, r/2)$ and
we denote by $W_r = \bigcup_{X \in C^0} B(X, r/2)$ the occupied component of the origin at level $r$ and by $V_r$ the
vacant component containing the point $\mathbf{t}/2$ at level $r$.

Since we may assume that all interdistances are different, there exists a unique pair $(X, Y)$
in the support of $\Phi^{0,\mathbf{t}}$ such that $\mathrm{perc}(0,\mathbf{t}) = |X - Y|$ (see Lemma 7).

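First consider the case $t < r_c - \epsilon$. Let $S_z = zt/2 + S(t/2)$ where $z \in \mathbb{Z}^d$. If the event
$\{\mathrm{perc}(0,\mathbf{t}) \in [t - x, t)\}$ occurs, there is some $z \in \mathbb{Z}^d$ such that $W \cap S_z \ne \emptyset$ and there exist
$X, Y \in \Phi \cap S_z$ such that $|X - Y| \in [t - x, t)$. Note that we have for any $z \in \mathbb{Z}^d$,

$P^{0,\mathbf{t}}(\exists X, Y \in \Phi \cap S_z,\ |X - Y| \in [t - x, t)) \le C(1 + t^{2d-1})x.$

Hence we have

$P^{0,\mathbf{t}}(\mathrm{perc}(0,\mathbf{t}) \in [t - x, t)) \le \sum_z P^{0,\mathbf{t}}(W \cap S_z \ne \emptyset,\ \exists X, Y \in \Phi(S_z),\ |X - Y| \in [t - x, t)),$   (14)

where $\|z\| := \max(|z_1|, |z_2|)$ and $K$ is a constant depending on $d$. Lemma 3.3 of [10] ensures
that the sum of (14) is finite for $t < r_c - \epsilon$.

The case $t > r_c + \epsilon$ is quite similar. If the event $\{\mathrm{perc}(0,\mathbf{t}) \in [t - x, t)\}$ occurs, there is some
$z \in \mathbb{Z}^d$ such that $V_{t-x} \cap S_z \ne \emptyset$ and there exist $X, Y \in \Phi \cap S_z$ such that $|X - Y| \in [t - x, t)$.
Hence we have

$P^{0,\mathbf{t}}(\mathrm{perc}(0,\mathbf{t}) \in [t - x, t)) \le \sum_z P^{0,\mathbf{t}}(V_{t-x} \cap S_z \ne \emptyset)\, C(1 + t^{2d-1})\,x \le C_1 e^{-C_2 t} x,$   (15)

where (15) follows from Lemma 4.1 of [10] and the fact that

$P^{0,\mathbf{t}}(V_{t-x} \cap S_z \ne \emptyset) \le P^{0,\mathbf{t}}(d(V_{t-x}) \ge \|z - t/2\|).$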



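We now concentrate on the case $t \in (r_c - \epsilon, r_c + \epsilon)$. We define the event

$A = \{\text{the points of } \Phi \text{ on the axis } e_1 \text{ are in } C^0\}$, where $e_1 = (1, 0, \ldots, 0)$.

Under $P_r^{0,\mathbf{t}}$, with probability one, we have $A = \{\mathbf{t} \in C^0\}$ and

$P^{0,\mathbf{t}}(\mathrm{perc}(0,\mathbf{t}) \in [t - x, t)) = P_t^{0,\mathbf{t}}(A) - P_{t-x}^{0,\mathbf{t-x}}(A) + P_{t-x}^{0,\mathbf{t-x}}(A) - P_{t-x}^{0,\mathbf{t}}(A).$   (16)

We first prove that

$P_{t-x}^{0,\mathbf{t-x}}(A) - P_{t-x}^{0,\mathbf{t}}(A) \le C t^{d-1} x.$   (17)

Note that

$P_{t-x}^{0,\mathbf{t}}(A) = P_{t-x}^{0}(B(\mathbf{t}, t - x) \cap C^0 \ne \emptyset),$
$P_{t-x}^{0,\mathbf{t-x}}(A) = P_{t-x}^{0}(B(\mathbf{t-x}, t - x) \cap C^0 \ne \emptyset),$

where $B(X, r)$ denotes the open ball of radius $r > 0$ centered at $X \in \mathbb{R}^d$. Hence we have

$P_{t-x}^{0,\mathbf{t-x}}(A) - P_{t-x}^{0,\mathbf{t}}(A) \le P(\Phi(B(\mathbf{t}, t - x) \Delta B(\mathbf{t-x}, t - x)) \ge 1) \le C t^{d-1} x,$

where $B(\mathbf{t}, t - x) \Delta B(\mathbf{t-x}, t - x)$ denotes the symmetric difference. This is exactly (17).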
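We then write:

$\frac{1}{x}\int_{r_c-\epsilon}^{r_c+\epsilon} P^{0,\mathbf{t}}(\mathrm{perc}(0,\mathbf{t}) \in [t - x, t))\, t^{d-1}\,dt = \frac{1}{x}\int_{r_c-\epsilon}^{r_c+\epsilon} P_t^{0,\mathbf{t}}(A)\, t^{d-1}\,dt - \frac{1}{x}\int_{r_c-\epsilon}^{r_c+\epsilon} P_{t-x}^{0,\mathbf{t-x}}(A)\, t^{d-1}\,dt + \frac{1}{x}\int_{r_c-\epsilon}^{r_c+\epsilon} \left(P_{t-x}^{0,\mathbf{t-x}}(A) - P_{t-x}^{0,\mathbf{t}}(A)\right) t^{d-1}\,dt.$   (18)

With the change of variable $t \mapsto t - x$, the second term on the right hand side of (18) is
decomposed as follows:

$\frac{1}{x}\int_{r_c-\epsilon}^{r_c+\epsilon} P_{t-x}^{0,\mathbf{t-x}}(A)\, t^{d-1}\,dt \le \frac{1}{x}\int_{r_c-\epsilon-x}^{r_c+\epsilon-x} P_t^{0,\mathbf{t}}(A)\, t^{d-1}\,dt + K\int_{r_c-\epsilon-x}^{r_c+\epsilon-x} P_t^{0,\mathbf{t}}(A)\, t^{d-2}\,dt,$

where $K$ is a constant depending on $d$.

Hence, the decomposition (18) is further decomposed as

$\frac{1}{x}\int_{r_c-\epsilon}^{r_c+\epsilon} P^{0,\mathbf{t}}(\mathrm{perc}(0,\mathbf{t}) \in [t - x, t))\, t^{d-1}\,dt \le \frac{1}{x}\int_{r_c+\epsilon-x}^{r_c+\epsilon} P_t^{0,\mathbf{t}}(A)\, t^{d-1}\,dt + \frac{1}{x}\int_{r_c-\epsilon-x}^{r_c-\epsilon} P_t^{0,\mathbf{t}}(A)\, t^{d-1}\,dt + K\int_{r_c-\epsilon-x}^{r_c+\epsilon-x} P_t^{0,\mathbf{t}}(A)\, t^{d-2}\,dt + \frac{1}{x}\int_{r_c-\epsilon}^{r_c+\epsilon} \left(P_{t-x}^{0,\mathbf{t-x}}(A) - P_{t-x}^{0,\mathbf{t}}(A)\right) t^{d-1}\,dt.$

By (17), the last term is bounded by $\int_{r_c-\epsilon}^{r_c+\epsilon} C t^{d-1}\,dt = C_1$. It implies that

$\frac{1}{x}\int_{r_c-\epsilon}^{r_c+\epsilon} P^{0,\mathbf{t}}(\mathrm{perc}(0,\mathbf{t}) \in [t - x, t))\, t^{d-1}\,dt \le C_2.$   (19)

Proposition 8 now follows from Lemma 9 and Equation (19).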
3.3 The lower bound in Model 2

We start with a slight extension of Proposition 9 of [3] (see also Theorem 7 in [4]). In what
follows, a set of points is identified with its associated geometric graph which is the complete
graph over these points with Euclidean distance as edge-lengths.

Lemma 10 Let $\Phi_n$ denote the point process consisting of $n$ points $\{\xi_i^{(n)}, 1 \le i \le n\}$ which are
independent and have the uniform distribution on the square $[0, n^{1/d}]^d$. For each $n$, let $U_n$ be
chosen independently and uniformly from the set $\{1, \ldots, n\}$, and let

$\Phi_n^0 = \{\tilde\xi_i^{(n)} := \xi_i^{(n)} - \xi_{U_n}^{(n)},\ 1 \le i \le n\}.$

To each vertex $\tilde\xi_i^{(n)}$ of the rooted (at the origin) geometric graph $\Phi_n^0$, we associate the mark
$\mathrm{perc}_i^n = \mathrm{perc}(0, \tilde\xi_i^{(n)})$ as defined in (1). We denote by $(\Phi_n^0, \mathrm{perc}^n) = \{\tilde\xi_i^{(n)}, \mathrm{perc}_i^n\}$ the corresponding
marked geometric graph. Then one has joint weak convergence

$((\Phi_n^0, \mathrm{perc}^n), \mathrm{MST}(\Phi_n^0)) \xrightarrow{d} ((\Phi^0, \mathrm{perc}), \mathcal{F}_\infty),$   (20)

where $(\Phi^0, \mathrm{perc})$ is the Palm version of the Poisson process of intensity 1 with the mark $\mathrm{perc}(0, \eta_i)$
associated to point $\eta_i$.

Here convergence $\mathrm{MST}(\Phi_n^0) \to \mathcal{F}_\infty$ is local weak convergence in the sense of [1].

Proof. The analog of (20) without marks is Proposition 9 of [3]. By the Skorokhod representation
theorem, we can assume that with probability one, we have

$(\Phi_n^0 = \{\tilde\xi_i^{(n)}\}, \mathrm{MST}(\Phi_n^0)) \to (\Phi^0 = \{\eta_i\}, \mathcal{F}_\infty).$   (21)

We have to prove that for any $i \ge 1$,

$\lim_{n \to \infty} \mathrm{perc}(0, \tilde\xi_i^{(n)}) = \mathrm{perc}(0, \eta_i)$ a.s.

By Lemma 7 we know that $\mathrm{perc}(0, \eta_i) = \max\{\mathrm{len}(e), e \in \pi^*\}$ where $\pi^*$ is the minimax path
from $O$ to $\eta_i$. By definition of the metric of local weak convergence, (21) implies that for
arbitrary fixed $L$, we have with $S(L) = [-L, L]^d$,

$\Phi_n^0 \cap S(L) \to \Phi^0 \cap S(L).$

For $L$ sufficiently large, the path $\pi^*$ is included in $S(L)$ and let $\pi^*_n$ be the associated path in
$\Phi_n^0$. Since

$\mathrm{perc}(0, \tilde\xi_i^{(n)}) = \min_{\pi} \max_{e \in \pi} \mathrm{len}(e) \le \max_{e \in \pi^*_n} \mathrm{len}(e),$

by the convergence of $\pi^*_n$ to $\pi^*$, we have

$\limsup_{n \to \infty} \mathrm{perc}(0, \tilde\xi_i^{(n)}) \le \max\{\mathrm{len}(e), e \in \pi^*\} = \mathrm{perc}(0, \eta_i).$

Now we need to prove that

$\liminf_{n \to \infty} \mathrm{perc}(0, \tilde\xi_i^{(n)}) \ge \mathrm{perc}(0, \eta_i).$   (22)

Take $\pi^{(n)} : O \to \tilde\xi_i^{(n)}$ such that

$\max_{e \in \pi^{(n)}} \mathrm{len}(e) = \mathrm{perc}(0, \tilde\xi_i^{(n)}).$

For $r > 0$, we denote by $G_r(\eta_i)$ (resp. $G_r^n(\tilde\xi_i^{(n)})$) the connected component of $\Phi^0$ (resp. $\Phi_n^0$)
with edge length less than $r$ containing $\eta_i$ (resp. $\tilde\xi_i^{(n)}$). Let $\mathrm{perc}(0, \eta_i) = t$, so that we have
$G_t(0) \cap G_t(\eta_i) = \emptyset$ and say $G_t(0)$ is finite (see the uniqueness property in the proof of Lemma
7). We define $\bar G_t$ (resp. $\bar G_t^n$) to be the subgraph of $\Phi^0$ (resp. $\Phi_n^0$) consisting of those edges
with length less than $t + 1$ with exactly one of its end vertices in $G_t(0)$ (resp. $G_t^n(0)$). Let
$e^* = \mathrm{argmax}\{\mathrm{len}(e), e \in \pi^*\}$. By Lemma 7 we know that $\mathrm{perc}(0, \eta_i) = \mathrm{len}(e^*)$ and $e^* \in \bar G_t$ is
such that $e^* = \mathrm{argmin}\{\mathrm{len}(e), e \in \bar G_t\}$. Since $G_t(0)$ is finite, we have clearly that $\bar G_t$ is included
in $S(L)$ for sufficiently large $L$. Then we have

$\max_{e \in \pi^{(n)}} \mathrm{len}(e) \ge \min\{\mathrm{len}(e), e \in \bar G_t^n\} \to t = \mathrm{perc}(0, \eta_i)$ as $n \to \infty$,

where the last limit follows from the convergence of $\bar G_t^n$ to $\bar G_t$.

We now return to the proof of the lower bound in Model 2. We start by copying and
modifying the argument from Section 2.4. Fix $\delta > 0$. Let $\xi_{U_n}$ be a uniform random vertex from
$(\xi_i; 1 \le i \le n)$. Consider a pair $(T'_n, T_n)$ attaining the minimum in the definition (3) of $\varepsilon_n(\delta)$.
Then

$E\sum_i 1\{(\xi_{U_n}, \xi_i) \in T'_n \setminus T_n\} = \frac{2}{n} E|T'_n \setminus T_n| \ge 2\delta$   (23)

and

$E\varepsilon_n(\delta) = \frac{1}{n} E(\mathrm{len}(T'_n) - \mathrm{len}(T_n))$
$\ge \frac{1}{n} E\sum_{e \in T'_n \setminus T_n} \mathrm{exc}(e)$ by Lemma 5
$= \frac{1}{2} E\sum_i \mathrm{exc}(\xi_{U_n}, \xi_i) 1\{(\xi_{U_n}, \xi_i) \in T'_n \setminus T_n\}.$   (24)

Note that for $0 < L < \infty$

$E\sum_i 1\{\mathrm{len}(\xi_{U_n}, \xi_i) > L,\ (\xi_{U_n}, \xi_i) \in T'_n\} = \frac{2}{n} E|\{e \in T'_n : \mathrm{len}(e) > L\}|$
$\le \frac{2}{n} \frac{E\,\mathrm{len}(T'_n)}{L}$
$\le \delta$ for $L = L(\delta)$ sufficiently large,

the last inequality because $E\,\mathrm{len}(T'_n) = O(n)$. So fixing such an $L$, (23) implies

$E\sum_i 1\{(\xi_{U_n}, \xi_i) \in T'_n \setminus T_n,\ \mathrm{len}(\xi_{U_n}, \xi_i) \le L\} \ge \delta$   (25)

while (24) trivially implies

$2E\varepsilon_n(\delta) \ge E\sum_i \mathrm{exc}(\xi_{U_n}, \xi_i) 1\{(\xi_{U_n}, \xi_i) \in T'_n \setminus T_n,\ \mathrm{len}(\xi_{U_n}, \xi_i) \le L\}.$   (26)

The purpose of these representations is to exploit local weak convergence. Consider the
near-minimal STs $T'_n$ appearing in (25, 26). By a compactness argument and by passing to a
subsequence of $n$ we may assume that they converge to some forest $\mathcal{F}'_\infty$ on $(\eta_i)$; that is, we may
assume that (20) remains true when we append $T'_n$ to the left side and $\mathcal{F}'_\infty$ to the right side.
We can now take limits in (25) to deduce

$\sum_i P((0, \eta_i) \in \mathcal{F}'_\infty \setminus \mathcal{F}_\infty,\ \mathrm{len}(0, \eta_i) \le L) \ge \delta.$

And taking limits in (26) gives

$2\liminf_n E\varepsilon_n(\delta) \ge E\sum_i (\mathrm{len}(0, \eta_i) - \mathrm{perc}(0, \eta_i)) 1\{(0, \eta_i) \in \mathcal{F}'_\infty \setminus \mathcal{F}_\infty,\ \mathrm{len}(0, \eta_i) \le L\}.$   (27)

Writing

$V_i = \mathrm{len}(0, \eta_i) - \mathrm{perc}(0, \eta_i)$
$A_i = \{(0, \eta_i) \in \mathcal{F}'_\infty \setminus \mathcal{F}_\infty,\ \mathrm{len}(0, \eta_i) \le L\},$

we are precisely in the setting in which Proposition 8 and Lemma 6(b) apply, and the conclusion
is that the right side of (27) is $\ge (\beta - o(1))\delta^2$ for small $\delta$, implying the lower bound in Theorem
2.